NRPSpredictor2—a web server for predicting NRPS adenylation domain specificity
Abstract
The products of many bacterial non-ribosomal peptide synthetases (NRPS) are highly important secondary metabolites, including vancomycin and other antibiotics. Predicting the substrate specificity of NRPS adenylation (A) domains newly detected by genome sequencing efforts is of great importance for identifying and annotating new gene clusters that produce secondary metabolites. A-domain specificity can be predicted from sequence alone using sequence signatures or, more accurately, machine learning methods. We present an improved predictor, based on previous work (NRPSpredictor), that predicts A-domain specificity using Support Vector Machines on four hierarchical levels, ranging from gross physicochemical properties of an A-domain's substrates down to single amino acid substrates. The three more general levels are predicted with an F-measure better than 0.89 and the most detailed level with an average F-measure of 0.80. We also modeled the applicability domain of our predictor, so that it can estimate whether a new A-domain lies within that domain. Finally, because NRPS also play an important role in the natural products chemistry of fungi, producing compounds such as peptaibols and cephalosporins, we added a predictor for fungal A-domains, which predicts gross physicochemical properties with an F-measure of 0.84. The service is available at http://nrps.informatik.uni-tuebingen.de/.
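For readers unfamiliar with the F-measure quoted in the abstract, the following is a minimal sketch of how such a score is commonly computed: the harmonic mean of precision and recall, averaged over classes. The function names, the class labels, and the counts are hypothetical illustrations and are not taken from NRPSpredictor2; the abstract also does not state how its average was formed, so the simple per-class mean used here is an assumption.

```python
from typing import Dict, Tuple


def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure (F1) for one class: harmonic mean of precision and recall."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f_measure(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-class F-measures (an assumed averaging scheme)."""
    scores = [f_measure(tp, fp, fn) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts for two made-up specificity classes.
example_counts = {
    "class_a": (8, 2, 2),
    "class_b": (9, 1, 3),
}

if __name__ == "__main__":
    for name, (tp, fp, fn) in example_counts.items():
        print(name, round(f_measure(tp, fp, fn), 3))
    print("average", round(average_f_measure(example_counts), 3))
```

An unweighted mean treats rare and common substrate classes equally; a support-weighted mean would be another reasonable choice and would generally give a different number.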
Related resources
NRPS-PKS: a knowledge-based resource for analysis of NRPS/PKS megasynthases
NRPS-PKS is web-based software for analysing large multi-enzymatic, multi-domain megasynthases that are involved in the biosynthesis of pharmaceutically important natural products such as cyclosporin, rifamycin and erythromycin. NRPS-PKS has been developed based on a comprehensive analysis of the sequence and structural features of several experimentally characterized biosynthetic gene clusters...
Characterization and Engineering of the Adenylation Domain of a NRPS-Like Protein: A Potential Biocatalyst for Aldehyde Generation
The adenylation (A) domain acts as the first "gate-keeper" to ensure the activation and thioesterification of the correct monomer to nonribosomal peptide synthetases (NRPSs). Our understanding of the specificity-conferring code and our ability to engineer A domains are critical for increasing the chemical diversity of nonribosomal peptides (NRPs). We recently discovered a novel NRPS-like protei...
Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs)
We present a new support vector machine (SVM)-based approach to predict the substrate specificity of subtypes of a given protein sequence family. We demonstrate the usefulness of this method on the example of aryl acid-activating and amino acid-activating adenylation domains (A domains) of nonribosomal peptide synthetases (NRPS). The residues of gramicidin synthetase A that are 8 Å around the s...
Identification of Sare0718 As an Alanine-Activating Adenylation Domain in Marine Actinomycete Salinispora arenicola CNS-205
BACKGROUND Amino acid adenylation domains (A domains) are critical enzymes that dictate the identity of the amino acid building blocks to be incorporated during nonribosomal peptide (NRP) biosynthesis. NRPs represent a large group of valuable natural products that are widely applied in medicine, agriculture, and biochemical research. Salinispora arenicola CNS-205 is a representative strain of t...
Enzyme Redesign by SVM
In [1], a new support vector machine (SVM)-based approach was proposed to predict the substrate (adenylation domain, or A domain in short) specificity of subtypes of a given protein sequence family (nonribosomal peptide synthetases, or NRPS in short). Based on the physico-chemical properties of the amino acids, the residues of NRPS were first encoded into vectors in high dimensional feature spa...